Dynamic Structures for Top-k Queries on Uncertain Data

Authors

  • Jiang Chen
  • Ke Yi
Abstract

In an uncertain data set 𝒮 = (S, p, f), where S is the ground set consisting of n elements, p : S → [0, 1] is a probability function, and f : S → R is a score function, each element i ∈ S with score f(i) appears independently with probability p(i). The top-k query on 𝒮 asks for the set of k elements that has the maximum probability of being the k elements with the highest scores in a random instance of 𝒮. Computing the top-k answer on a fixed 𝒮 is known to be easy. In this paper, we consider the dynamic problem, that is, how to maintain the top-k query answer when 𝒮 changes, including element insertions and deletions in the ground set S and changes in the probability function p and the score function f. We present a fully dynamic data structure that handles an update in O(k log k log n) time and answers a top-j query in O(log n + j) time for any j ≤ k. The structure has O(n) size and can be constructed in O(n log k) time. As a building block of our dynamic structure, we present an algorithm for the all-top-k problem, that is, computing the top-j answers for all j = 1, . . . , k, which may be of independent interest.
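To make the query semantics concrete, the following is a minimal brute-force sketch in Python. It is not the paper's dynamic structure or its all-top-k algorithm; it simply enumerates every k-subset and evaluates the probability that the subset is exactly the k highest-scoring elements present. It assumes distinct scores, and the names `topk_probability` and `brute_force_topk` and the `(score, probability)` tuple layout are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def topk_probability(subset, elements):
    """Probability that `subset` (indices) is exactly the top-k of a random instance.

    `elements` is a list of (score, probability) pairs with distinct scores
    (an assumption of this sketch). The subset is the top-k iff all of its
    members appear and every non-member scoring above its lowest member is absent.
    """
    chosen = set(subset)
    lowest = min(elements[i][0] for i in chosen)
    result = 1.0
    for i, (score, p) in enumerate(elements):
        if i in chosen:
            result *= p          # member must appear
        elif score > lowest:
            result *= 1.0 - p    # higher-scoring non-member must be absent
        # lower-scoring non-members may or may not appear
    return result


def brute_force_topk(elements, k):
    """Return (best index set, its probability) by trying all k-subsets."""
    best = max(
        combinations(range(len(elements)), k),
        key=lambda s: topk_probability(s, elements),
    )
    return set(best), topk_probability(best, elements)


if __name__ == "__main__":
    # Hypothetical data: (score, appearance probability)
    data = [(10, 0.2), (8, 0.9), (5, 0.8)]
    print(brute_force_topk(data, 1))
    print(brute_force_topk(data, 2))
```

With this hypothetical data, the highest-scoring element is not the top-1 answer because it rarely appears; the element with score 8 wins instead. The enumeration is exponential in k, so it serves only as a reference for the definition.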


Similar resources

Fully Dynamic Data Structure for Top-k Queries on Uncertain Data

Top-k queries allow end-users to focus on the most important (top-k) answers amongst those which satisfy the query. In traditional databases, a user-defined score function assigns a score value to each tuple, and a top-k query returns the k tuples with the highest scores. In an uncertain database, the top-k answer depends not only on the scores but also on the membership probabilities of tuples. Several top...


Ranking queries on uncertain data pdf

Top-k queries, also known as ranking queries, are often natural and useful in. Ing probabilistic threshold top-k queries on uncertain data. UNCERTAIN DATA MODELS W.R.T RANKING QUERIES. Uncertain attribute based on the associated discrete pdf and the choice is. observed, the semantics of top-k queries on uncertain data can be ambiguous due to tradeoffs. Whether it is better to report highly ranked i...


Top-k best probability queries and semantics ranking properties on probabilistic databases

There has been much interest in answering top-k queries on probabilistic data in various applications such as market analysis, personalised services, and decision making. In probabilistic relational databases, the most common problem in answering top-k queries (ranking queries) is selecting the top-k result based on scores and top-k probabilities. In this paper, we firstly propose novel answers...


Top-k Dominating Queries: a Survey

Top-k dominating queries combine the advantages of top-k queries and skyline queries, and eliminate their disadvantages. They return k objects with the highest domination score, which is defined as the number of dominated objects. As a top-k query, the user can bound the number of returned results through the parameter k, and like a skyline query a user-selected scoring function is not required...


Range Queries on Uncertain Data

Given a set P of n uncertain points on the real line, each represented by its one-dimensional probability density function, we consider the problem of building data structures on P to answer range queries of the following three types for any query interval I: (1) top-1 query: find the point in P that lies in I with the highest probability, (2) top-k query: given any integer k ≤ n as part of th...




Publication date: 2007